Hamming Distance based Binary PSO for Feature Selection and Classification from high dimensional Gene Expression Data

Authors

  • Haider Banka
  • Suresh Dara
Abstract

This article proposes a Binary Particle Swarm Optimization (BPSO) algorithm that uses Hamming distance as the distance measure between particles for feature selection from high-dimensional microarray gene expression data. Hamming distance serves as a similarity measure when updating the velocity of each particle (candidate solution). It also helps to avoid an extra parameter (i.e., Vmin) that conventional BPSO needs during the velocity update. A fast initial pre-processing heuristic first performs a crude reduction of the high-dimensional domain. The fitness function is then designed in a multi-objective framework for further reduction and fine tuning of the reduced feature set using BPSO. The proposed method is tested on three benchmark cancer datasets (colon, lymphoma, and leukemia). A comparative study against the existing literature is also performed to show the effectiveness of the proposed method.
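The abstract does not give the update equations, so the following is only a minimal Python sketch of how a Hamming distance between bit-vector particles could feed a BPSO velocity update. The function `update_particle`, the scaling of the attraction terms by the normalised Hamming distance, and the parameters `w`, `c1`, `c2`, and `vmax` are all assumptions made for illustration, not the authors' formulation.

```python
import math
import random


def hamming_distance(a, b):
    """Count the positions at which two equal-length bit vectors differ."""
    if len(a) != len(b):
        raise ValueError("bit vectors must have the same length")
    return sum(x != y for x, y in zip(a, b))


def update_particle(position, pbest, gbest, velocity, rng,
                    w=0.9, c1=2.0, c2=2.0, vmax=4.0):
    """One hypothetical BPSO step for a single particle.

    Assumption (not from the paper): the pull toward the personal best and
    the global best is scaled by the normalised Hamming distance to each.
    """
    n = len(position)
    d_p = hamming_distance(position, pbest) / n  # distance to personal best
    d_g = hamming_distance(position, gbest) / n  # distance to global best
    new_position, new_velocity = [], []
    for x, p, g, v in zip(position, pbest, gbest, velocity):
        v = (w * v
             + c1 * rng.random() * d_p * (p - x)
             + c2 * rng.random() * d_g * (g - x))
        v = max(-vmax, min(vmax, v))          # symmetric clamp (assumption)
        prob = 1.0 / (1.0 + math.exp(-v))     # sigmoid transfer function
        new_velocity.append(v)
        new_position.append(1 if rng.random() < prob else 0)
    return new_position, new_velocity


if __name__ == "__main__":
    rng = random.Random(0)
    particle = [1, 0, 1, 1]   # 1 = gene selected, 0 = gene dropped
    best = [1, 1, 0, 1]
    print(hamming_distance(particle, best))
    print(update_particle(particle, best, best, [0.0] * 4, rng))
```

In this sketch each bit marks whether a gene is kept in the feature subset, so the Hamming distance counts the genes on which two candidate subsets disagree; for `[1, 0, 1, 1]` and `[1, 1, 0, 1]` it is 2.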

Related resources

Feature Selection and Classification of Microarray Gene Expression Data of Ovarian Carcinoma Patients using Weighted Voting Support Vector Machine

DNA microarray gene expression gives access to a wealth of information spanning thousands of variables (genes). Analysis of this information can reveal the genetic causes of disease and the differences between tumors. In this study we try to reduce high-dimensional data by a statistical method to select valuable, high-impact genes as biomarkers and then classify ovarian tumors based on gene expression data of...


Feature selection using genetic algorithm for classification of schizophrenia using fMRI data

In this paper we propose a new method for classification of subjects into schizophrenia and control groups using functional magnetic resonance imaging (fMRI) data. In the preprocessing step, the number of fMRI time points is reduced using principal component analysis (PCA). Then, independent component analysis (ICA) is used for further data analysis. It estimates independent components (ICs) of...


Improving the Operation of Text Categorization Systems with Selecting Proper Features Based on PSO-LA

With the explosive growth in the amount of information, tools and methods for searching, filtering, and managing resources are badly needed. One of the major problems in text classification relates to high-dimensional feature spaces. Therefore, a main goal in text classification is to reduce the dimensionality of the feature space. There are many feature selection methods. However...


Improved Multiobjective Binary Biogeography Based Optimization using CVM for Feature Selection Using Gene Expression Data

Gene expression data play an important role in the development of efficient cancer diagnosis and classification. The genes identified are subsequently used to classify independent test set samples. Different feature selection methods are investigated, and the features selected most frequently across all methods are retained. This paper provides gene selection strategies for multiclass classification that ca...


SFLA Based Gene Selection Approach for Improving Cancer Classification Accuracy

In this paper, we propose a new gene selection algorithm based on the Shuffled Frog Leaping Algorithm, called SFLA-FS. The proposed algorithm is used to improve cancer classification accuracy. Most biological datasets, such as cancer datasets, have a large number of genes and few samples. However, most of these genes are not useful for some tasks, for example cancer classification....



Journal title:

Volume   Issue 

Pages  -

Publication date: 2014